Approximation Algorithms for Probabilistic Reasoning: Sampling and Iterative Inference
Abstract
The complexity of exact inference increases exponentially with the size and complexity of the network. As a result, exact inference methods become impractical for large networks, and we must settle for approximate answers. A variety of approximation methods exist. This research focuses on two approximation methods for finding posterior marginals P(xi|e) in Bayesian networks: iterative belief updating (defined by Pearl [Pearl 1988]) and sampling.

Belief updating is an exact inference method for singly-connected networks; applied to loopy networks, it yields approximate answers. The algorithm is based on message passing: in some order, each node computes and sends messages to its neighbors, incorporating the latest messages it received. In a singly-connected network, we can order the nodes so that it is sufficient for each node to pass one message in each direction. In a loopy network, the nodes compute several iterations of messages to achieve convergence (or to demonstrate the lack of convergence). Thus, belief updating in loopy networks is often referred to as Iterative Belief Propagation (IBP). Although IBP generally computes only approximate answers, it is known to perform extremely well on several special classes of networks, such as coding networks and noisy-OR networks. At the same time, we know that on some instances IBP does not converge or generates approximate answers that are far from correct. Currently, we have no methodology that would allow us, in the general case, to predict the convergence of IBP or to provide practical error bounds on the approximate marginals it computes.

In this research work, we examine the influence of the ε-cutset criterion on the convergence and quality of the approximate marginals computed by IBP. We conjecture that an ε-cutset (defined as a cycle-cutset whose posterior marginals are extreme, i.e., within ε of 0 or 1) has an effect similar to that of an observed cycle-cutset, which breaks the loops and leaves the network singly-connected. We prove that the conjecture is true for Bayesian networks without evidence and show that the error in the approximate marginals computed by IBP converges to 0 as ε tends to 0. We provide empirical support for instances of Bayesian networks with evidence.
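To make the message-passing scheme concrete, the following is a minimal sketch of the fixed-point iteration underlying IBP. It is illustrative only: Pearl's algorithm propagates lambda/pi messages over the DAG of a Bayesian network, whereas, for brevity, the sketch below runs the equivalent sum-product updates on a pairwise model with binary variables. All names here (loopy_bp, unary, pairwise) are assumptions of this sketch, not code from the thesis.

```python
import numpy as np

# Illustrative sketch of the fixed-point iteration behind IBP: every node
# repeatedly recomputes its outgoing messages from the latest incoming ones
# until the messages stop changing (convergence is not guaranteed on loopy
# graphs). For brevity this runs sum-product updates on a pairwise model
# with binary variables rather than Pearl's lambda/pi messages on a DAG.

def loopy_bp(unary, pairwise, edges, n_iters=50, tol=1e-6):
    """unary[i]: length-2 potential of node i; pairwise[(i, j)]: 2x2
    potential of edge (i, j); edges: each undirected edge listed once."""
    neighbors = {i: [] for i in range(len(unary))}
    msgs = {}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
        msgs[(i, j)] = np.ones(2) / 2        # uniform initial messages
        msgs[(j, i)] = np.ones(2) / 2

    for _ in range(n_iters):
        max_delta = 0.0
        for i, j in list(msgs):
            # Combine the node potential with messages from all neighbors
            # except the recipient j ...
            incoming = unary[i].copy()
            for k in neighbors[i]:
                if k != j:
                    incoming *= msgs[(k, i)]
            # ... then pass it through the edge potential, summing over x_i.
            phi = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
            new = phi.T @ incoming
            new /= new.sum()                 # normalize for numerical stability
            max_delta = max(max_delta, np.abs(new - msgs[(i, j)]).max())
            msgs[(i, j)] = new
        if max_delta < tol:                  # messages converged
            break

    # Belief of each node: its potential times all incoming messages.
    beliefs = []
    for i in range(len(unary)):
        b = unary[i].copy()
        for k in neighbors[i]:
            b *= msgs[(k, i)]
        beliefs.append(b / b.sum())
    return beliefs

# Example: a single loop of three nodes, the smallest graph where the
# iteration is approximate rather than exact.
u = [np.array([0.6, 0.4]), np.array([0.5, 0.5]), np.array([0.3, 0.7])]
agree = np.array([[1.2, 0.8], [0.8, 1.2]])   # potential favoring agreement
print(loopy_bp(u, {(0, 1): agree, (1, 2): agree, (0, 2): agree},
               edges=[(0, 1), (1, 2), (0, 2)]))
```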
The idea behind sampling methods for Bayesian networks is to generate a set of samples (where a sample over a set of variables X = {X1, ..., XN} is simply an assignment of values to the variables in X) and then estimate the posterior marginals of interest from the samples. In general, the quality of the approximate answers depends primarily on the number of samples generated, and the approximate values converge to the exact values as the number of samples increases. However, the sampling variance increases with the size of the sampling space. In this research work, we focus on variance reduction techniques, using the Gibbs sampler for Bayesian networks as our example. Clearly, we can achieve a reduction in variance by sampling only a subset of the variables; the implication, however, is that we must carry out many more analytical computations, which may render the whole approach impractical. We demonstrate that we can reduce the sampling space efficiently if we take the underlying network structure into consideration.

The time/space complexity of exact inference in Bayesian networks is exponential in the induced width of the graph. In our sampling scheme, called w-cutset sampling, we sample a subset of variables (called a cutset) carefully chosen so that, once the cutset variables are instantiated, the induced width of the remaining network is bounded by w. We analyze the problem of finding an optimal w-cutset of a graph (NP-hard in the general case) and provide a heuristic algorithm for finding a w-cutset in practice. We show empirically that w-cutset sampling typically finds better approximate answers than the standard Gibbs sampler over a range of w values, although its performance eventually deteriorates as w increases.
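Below is a toy sketch of cutset-conditioned Gibbs sampling. It is a sketch under stated assumptions, not the thesis implementation: variables are binary, the network is encoded as CPT tables, and the exact-inference steps enumerate the non-cutset variables by brute force, standing in for the bucket/join-tree elimination (exponential only in w once the cutset is instantiated) that the real scheme would use. All names here (w_cutset_gibbs, conditional, joint) are hypothetical.

```python
import itertools
import random

def joint(assign, parents, cpt):
    """Probability of a full assignment: product of the CPT entries.
    cpt[i][pa] is P(X_i = 1 | parents of X_i take the values pa)."""
    p = 1.0
    for i, xi in enumerate(assign):
        pa = tuple(assign[j] for j in parents[i])
        p1 = cpt[i][pa]
        p *= p1 if xi == 1 else 1.0 - p1
    return p

def conditional(i, assign, parents, cpt, summed):
    """P(X_i = 1 | all variables outside `summed`), marginalizing the
    variables in `summed` (which must not contain i) by enumeration --
    a brute-force stand-in for w-bounded exact inference."""
    weights = [0.0, 0.0]
    for xi in (0, 1):
        assign[i] = xi
        for vals in itertools.product((0, 1), repeat=len(summed)):
            for v, val in zip(summed, vals):
                assign[v] = val
            weights[xi] += joint(assign, parents, cpt)
    return weights[1] / (weights[0] + weights[1])

def w_cutset_gibbs(cutset, evidence, parents, cpt, query, n_samples=1000):
    """Estimate P(X_query = 1 | evidence); query must be a non-cutset,
    unobserved variable."""
    n = len(parents)
    hidden = [v for v in range(n) if v not in cutset and v not in evidence]
    assign = [0] * n
    for v, val in evidence.items():
        assign[v] = val
    for c in cutset:
        assign[c] = random.randint(0, 1)     # arbitrary initial cutset state
    estimate = 0.0
    for _ in range(n_samples):
        for c in cutset:                     # Gibbs sweep over the cutset only
            p1 = conditional(c, assign, parents, cpt, hidden)
            assign[c] = 1 if random.random() < p1 else 0
        # Rao-Blackwellised estimate: average the exact conditional marginal
        # of the query under each cutset sample instead of counting samples.
        estimate += conditional(query, assign, parents, cpt,
                                [h for h in hidden if h != query])
    return estimate / n_samples

# Example: X0 -> X1, X0 -> X2, X1 -> X3, X2 -> X3 (a loopy network);
# conditioning on the cutset {X0} simplifies the rest.
parents = [[], [0], [0], [1, 2]]
cpt = [{(): 0.5},
       {(0,): 0.2, (1,): 0.8},
       {(0,): 0.3, (1,): 0.7},
       {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9}]
print(w_cutset_gibbs(cutset=[0], evidence={3: 1}, parents=parents,
                     cpt=cpt, query=1))
```

Sampling only the cutset trades per-sample cost for variance: each sample is more expensive (exact inference over the conditioned network), but far fewer random dimensions remain, which is precisely the trade-off described above.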
Similar Articles
Load-Frequency Control: a GA based Bayesian Networks Multi-agent System
Bayesian Networks (BN) provide a robust probabilistic method of reasoning under uncertainty. They have been successfully applied in a variety of real-world tasks, but they have received little attention in the area of load-frequency control (LFC). In practice, LFC systems use proportional-integral controllers. However, since these controllers are designed using a linear model, the nonlinearities...
Improving the Efficiency of Approximate Inference for Probabilistic Logical Models by means of Program Specialization
We consider the task of performing probabilistic inference with probabilistic logical models. Many algorithms for approximate inference with such models are based on sampling. From a logic programming perspective, sampling boils down to repeatedly calling the same queries on a knowledge base composed of a static and a dynamic part. The larger the static part, the more redundancy there is in the...
URDF: Efficient Reasoning in Uncertain RDF Knowledge Bases with Soft and Hard Rules
We present URDF, an efficient reasoning framework for graph-based, nonschematic RDF knowledge bases and SPARQL-like queries. URDF augments first-order reasoning by a combination of soft rules, with Datalog-style recursive implications, and hard rules, in the shape of mutually exclusive sets of facts. It incorporates the common possible worlds semantics with independent base facts as it is preva...
A Hybrid Approach to Inference in Probabilistic Non-Monotonic Logic Programming
We present a probabilistic inductive logic programming framework which integrates non-monotonic reasoning, probabilistic inference, and parameter learning. In contrast to traditional approaches to probabilistic Answer Set Programming (ASP), our framework imposes comparatively few restrictions on probabilistic logic programs; in particular, it allows for ASP as well as FOL syntax, and for ...
Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model
Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Due to the intractable likelihood of these models, training any topic model requires the use of an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approxi...
Iterative Algorithms for Graphical Models
Probabilistic inference in Bayesian networks, and even reasoning within error bounds, are known to be NP-hard problems. Our research focuses on investigating approximate message-passing algorithms inspired by Pearl's belief propagation algorithm and by variable elimination. We study the advantages of bounded inference provided by anytime schemes such as Mini-Clustering (MC), and combine them wit...